Giving the Power to Bilingual Speakers: Position Paper for the APE Workshop

Author

  • Ariadna Font Llitjós
Abstract

Post-editing has traditionally been used to improve the quality of MT output enough to make it usable. More recently, post-editing has also been shown to be useful for automatic evaluation of MT systems. As an alternative to current automatic evaluation metrics, such as BLEU and NIST, which are criticized for not correlating highly with human judgments, minimal post-editing of MT output provides a good measure of the quality of a system translation, and as a by-product it yields human reference translations that are relevant to MT systems. Such system-friendly translations can then be used to automatically evaluate systems much more fairly, since systems are not penalized for having picked a different sense of the word or a different, yet correct, syntactic construction (Snover et al., 2006). Mostly for practical reasons, post-editing has been a monolingual task done by linguists or editors without the source language information available. However, not having access to the original meaning can hinder the post-editing activity, and can even yield incorrect reference translations. To mitigate this problem, for the GALE evaluation, multiple translators had to agree on a gold standard reference translation that conveys exactly the same meaning as the source language sentence. This gold standard is then used by post-editors as if it were the original source language sentence when post-editing the raw MT output. Nevertheless, not only is crafting such a gold standard a difficult task, but there is also no guarantee that the real meaning of the source language sentence is fully captured by the gold standard translation. For language pairs with a large number of speakers, and certainly for the pairs researchers are working on for GALE (Olive, 2005), Chinese-English and Arabic-English, it would certainly be easier to recruit bilingual speakers as post-editors than to generate a gold standard for each translation.
In this context, I use the term bilingual speaker loosely. By bilingual speakers, I do not mean people who were born and raised bilingually and have native skills in both languages, but rather people who are native speakers of one of the two languages and fluent in the other. A good example of what I mean by bilingual speakers here could be second-generation speakers of Chinese or Arabic who were born and live in the United States or another English-speaking country. …

Related resources

Who Is a Bilingual?

The question of who is and who is not a bilingual is more difficult to answer than it first appears. Bilingualism was long regarded as the equal mastery of two languages, a definition that still prevails in certain glossaries of linguistics. However, today's complex world requires a more exact definition and analysis of the competencies that community members require to interact with speakers o...

Production of Greek and Turkish vowels by bilingual speakers

The present study examines the acoustic vowel space of Greek and Turkish vowels produced by bilingual speakers. The results are presented for both languages’ vowels production by bilingual speakers, as well as compared with the vowels production by monolingual speakers. It is revealed that the acoustic space of the Greek vowels produced by bilingual speakers is larger than the acoustic space of...

Native and Non-native Perception of Stress in Mapudungun: Assessing Structural Maintenance in the Phonology of an Endangered Language.

Today, virtually all speakers of Mapudungun (formerly Araucanian), an endangered language of Chile and Argentina, are bilingual in Spanish. As a result, the firmness of native speaker intuitions-especially regarding perceptually complex issues such as word-stress-has been called into question. Even though native intuitions are unavoidable in the investigation of stress position, efforts can be ...

Stuttering Prevalence among Kurdish-Farsi Students Effects of the Two Languages Similarities

Objectives: It has been noted that stuttering is more prevalent in bilinguals than in monolinguals. The similarities of the languages involved have been mentioned to justify the difference between stuttering prevalence among bilingual and monolingual speakers. The aim of this study is to investigate the effect of language similarities on prevalence of stuttering among Kurdish-Farsi bilingual st...

Comparison of Approaches for Language Revitalization of Northern Khmer in Thailand

Although 1.4 million people speak Northern Khmer in Thailand, they are aware that their language is still in decline. To deal with this threat, native speakers have cooperated with linguists from Mahidol University to work on a community-based research project since 2007. Teaching the Northern Khmer language as a subject in the formal school system was the first project which started at Ban Pho...

Publication date: 2006